Vocalic sandwich, a unit designed for unit selection TTS
Abstract
Unit selection text-to-speech systems currently produce very natural synthetic sentences by concatenating speech segments from a large database. Recently, increasing demand for high-quality voices built from less data has created a need to further optimize the textual corpus recorded by the speaker. This optimization is traditionally guided by the coverage rate of well-known units such as triphones and words. Such units, however, are not specific to concatenative speech synthesis; they are in general use across speech technologies and linguistics. In this paper, we describe a new unit that takes into account the specific features of concatenative TTS: the "vocalic sandwich." Both an objective and a perceptual evaluation tend to show that vocalic sandwiches are appropriate units for corpus design.
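As an illustration of coverage-guided corpus design in general, the following is a minimal sketch of greedy sentence selection against a set of target units. It is not the paper's method: the abstract does not define the vocalic sandwich, so the unit extractor here (`char_trigrams`) is a hypothetical stand-in that uses character trigrams rather than triphones or vocalic sandwiches, and the names `coverage_rate` and `greedy_select` are assumptions introduced for this example.

```python
from __future__ import annotations

from typing import Callable, Iterable


def char_trigrams(sentence: str) -> set[str]:
    """Hypothetical unit extractor: character trigrams as a stand-in for real units."""
    s = sentence.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def coverage_rate(
    selected: Iterable[str],
    target_units: set[str],
    extract: Callable[[str], set[str]] = char_trigrams,
) -> float:
    """Fraction of target units that occur in at least one selected sentence."""
    if not target_units:
        return 1.0
    covered: set[str] = set()
    for sentence in selected:
        covered |= extract(sentence)
    return len(covered & target_units) / len(target_units)


def greedy_select(
    candidates: Iterable[str],
    target_units: set[str],
    max_sentences: int,
    extract: Callable[[str], set[str]] = char_trigrams,
) -> list[str]:
    """Repeatedly pick the sentence that covers the most still-uncovered units."""
    remaining = set(target_units)
    pool = list(candidates)
    chosen: list[str] = []
    while pool and remaining and len(chosen) < max_sentences:
        best = max(pool, key=lambda s: len(extract(s) & remaining))
        gain = extract(best) & remaining
        if not gain:
            break  # no candidate adds any new unit
        chosen.append(best)
        remaining -= gain
        pool.remove(best)
    return chosen
```

Swapping in a different `extract` function changes which unit drives the selection while the selection loop itself stays the same.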
Similar resources
On the Suitability of Vocalic Sandwiches in a Corpus-Based TTS Engine
Unit selection speech synthesis systems generally rely on target and concatenation costs for selecting the best unit sequence. The role of the concatenation cost is to ensure that joining two voice segments will not cause any acoustic artefact to appear. For this task, acoustic distances (MFCC, F0) are typically used but in many cases, this is not enough to prevent concatenation artefacts. Amon...
Phonetically enriched labeling in unit selection TTS synthesis
Unit selection techniques have improved the quality of text-to-speech (TTS) synthesis. However, mistakes which had been less noticeable previously in poorer quality synthetic speech become very noticeable in more natural-sounding synthetic speech. Many problems appear to be caused by mismatches between phones requested by the TTS frontend and phones selected from the labeled speech inventory. Gi...
Efficiency Assessment of Acoustic Cabin for Providing Acoustic Comfort in Turbine Unit of a Thermal Power Plant
Background and Objective: A practical method for noise control in environments with different noise sources is designing an acoustic cabin for the workers. In this regard, this study aimed to assess the efficiency of the acoustic cabin in a typical turbine unit of a thermal power plant to provide acoustic comfort. Materials and Methods: Measurement of the noise level and spectrum, as well as v...
Evaluation of Finnish unit selection and HMM-based speech synthesis
Unit selection and hidden Markov model (HMM) based synthesis have become the dominant techniques in text-to-speech (TTS) research. In this work, we combine HMM-based signal generation with the front end originally designed for unit selection based Finnish TTS and we evaluate the prosody of the output generated by the two synthesis techniques using the same speech database. Furthermore, we study...
A hybrid TTS between unit selection and HMM-based TTS under limited data conditions
The intelligibility of HMM-based TTS can reach that of the original speech. However, HMM-based TTS is far from natural. In contrast, unit selection TTS is currently the most natural-sounding TTS. However, its intelligibility and naturalness in terms of segmental duration and timing are not stable. Additionally, unit selection needs to store a huge amount of data for concatenation. Recently, hybrid a...